87 research outputs found

    RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language.

    Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]

    Probabilistic Graphical Model Representation in Phylogenetics

    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (1) reproducibility of an analysis, (2) model development and (3) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and non-specialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well suited for teaching statistical models, for facilitating communication among phylogeneticists, and for developing generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution.
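
As a rough illustration of the idea described in this abstract, the sketch below writes a toy two-level model as a product of conditionally independent factors and samples its posterior with a random-walk Metropolis-Hastings step. It is not RevBayes or Rev code and is not taken from the paper: the model, the data values, and every function name are assumptions chosen only for illustration.

```python
import math
import random

# Toy graphical model (an assumption for illustration, not from the paper):
#   mu ~ Normal(0, 10)           stochastic node with a prior
#   x_i | mu ~ Normal(mu, 1)     observed nodes, independent given mu

def log_normal_pdf(x, mean, sd):
    """Log density of a Normal(mean, sd) distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def log_joint(mu, data):
    """The joint log density factorizes over the nodes of the graph."""
    log_prior = log_normal_pdf(mu, 0.0, 10.0)
    log_likelihood = sum(log_normal_pdf(x, mu, 1.0) for x in data)
    return log_prior + log_likelihood

def metropolis_hastings(data, n_iter=20000, step=0.5, seed=1):
    """Random-walk Metropolis-Hastings sampler for mu."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_joint(mu, data)
    samples = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)  # symmetric proposal
        candidate = log_joint(proposal, data)
        # Symmetric proposal, so the acceptance probability is
        # min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            mu, current = proposal, candidate
        samples.append(mu)
    return samples

data = [4.8, 5.1, 5.3, 4.9, 5.2]   # made-up observations
samples = metropolis_hastings(data)
burned = samples[5000:]            # discard burn-in
posterior_mean = sum(burned) / len(burned)
print(f"posterior mean of mu: {posterior_mean:.3f}")
```

Because this toy model is conjugate, the sampled mean can be checked against the analytical posterior mean, which is close to the sample mean of the data.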

    Inferring the demographic history of the North American firefly Photinus pyralis

    The firefly Photinus pyralis inhabits a wide range of latitudinal and ecological niches, with populations living from temperate to tropical habitats. Despite its broad distribution, its demographic history is unknown. In this study, we modelled and inferred different demographic scenarios for North American populations of P. pyralis, which were collected from Texas to New Jersey. We used a combination of ABC techniques (for multi-population/colonization analyses) and likelihood inference (dadi, StairwayPlot2, PoMo) for single-population demographic inference, which proved useful with our RAD data. We found that the most ancestral North American population lies in Texas, and that it subsequently colonized the central region of the US and, more recently, the northeastern coast. Our study confidently rejects a demographic scenario in which the northeastern populations colonized more southern areas until reaching Texas. To estimate the age of divergence of P. pyralis, which provides deeper insights into the history of the entire species, we assembled a multi-locus phylogenetic dataset covering the genus Photinus. We found that the phylogenetic node leading to P. pyralis lies at the end of the Miocene. Importantly, the modelled demographic history of North American P. pyralis serves as a null model of nucleotide diversity patterns in a widespread native insect species, which will be useful in future studies for detecting adaptation events in this firefly species, as well as for comparison with other North American insect taxa.

    Non-monophyly and intricate morphological evolution within the avian family Cettiidae revealed by multilocus analysis of a taxonomically densely sampled dataset

    Background: The avian family Cettiidae, including the genera Cettia, Urosphena, Tesia, Abroscopus and Tickellia, and the species Orthotomus cucullatus, has recently been proposed based on analysis of a small number of loci and species. The close relationship of most of these taxa was unexpected, and called for a comprehensive study based on multiple loci and dense taxon sampling. In the present study, we infer the relationships of all except one of the species in this family using one mitochondrial and three nuclear loci. We use traditional gene tree methods (Bayesian inference, maximum likelihood bootstrapping, parsimony bootstrapping), as well as a recently developed Bayesian species tree approach (*BEAST) that accounts for lineage sorting processes that might produce discordance between gene trees. We also analyse mitochondrial DNA for a larger sample, comprising multiple individuals and a large number of subspecies of polytypic species. Results: There are many topological incongruences among the single-locus trees, although none of these is strongly supported. The multi-locus tree inferred using concatenated sequences and the species tree agree well with each other, and are overall well resolved and well supported by the data. The main discrepancy between these trees concerns the most basal split. Both methods infer the genus Cettia to be highly non-monophyletic, as it is scattered across the entire family tree. Deep intraspecific divergences are revealed, and one or two species and one subspecies are inferred to be non-monophyletic (the number differs between methods). Conclusions: The molecular phylogeny presented here is strongly inconsistent with the traditional, morphology-based classification. The remarkably high degree of non-monophyly in the genus Cettia is likely to be one of the most extraordinary examples of misconceived relationships in an avian genus. The phylogeny suggests instances of parallel evolution, as well as highly unequal rates of morphological divergence in different lineages. This complex morphological evolution apparently misled earlier taxonomists. These results underscore the well-known but still often neglected problem of basing classifications on overall morphological similarity. Based on the molecular data, a revised taxonomy is proposed. Although the traditional and species tree methods inferred much the same tree in the present study, the assumption by species tree methods that all species are monophyletic is a limitation in these methods, as some currently recognized species might have more complex histories.

    Polymorphism‐aware estimation of species trees and evolutionary forces from genomic sequences with RevBayes

    Funding: Austrian Science Fund, Grant/Award Number: P34524-B; Biotechnology and Biological Sciences Research Council, Grant/Award Number: BB/W000768/1; Deutsche Forschungsgemeinschaft, Grant/Award Number: HO 6201/1-1; Vienna Science and Technology Fund, Grant/Award Number: MA016-061. 1. The availability of population genomic data through new sequencing technologies gives unprecedented opportunities for estimating important evolutionary forces such as genetic drift, selection and mutation biases across organisms. Yet, analytical methods that can handle polymorphisms jointly with sequence divergence across species are rare and not easily accessible to empiricists. 2. We implemented polymorphism-aware phylogenetic models (PoMos), an alternative approach for species tree estimation, in the Bayesian phylogenetic software RevBayes. PoMos naturally account for incomplete lineage sorting, which is known to cause difficulties for phylogenetic inference in species radiations, and scale well with genome-wide data. Simultaneously, PoMos can estimate mutation and selection biases. 3. We have applied our methods to resolve the complex phylogenetic relationships of a young radiation of Chorthippus grasshoppers, based on coding sequences. In addition to establishing a well-supported species tree, we found a mutation bias favouring AT alleles and selection bias promoting the fixation of GC alleles, the latter consistent with GC-biased gene conversion. The selection bias is two orders of magnitude lower than genetic drift, validating the critical role of nearly neutral evolutionary processes in species radiation. 4. PoMos offer a wide range of models to reconstruct phylogenies and can be easily combined with existing models in RevBayes (for example, relaxed clock and divergence time estimation), offering new insights into the evolutionary processes underlying molecular evolution and, ultimately, species diversification.

    P³: Phylogenetic posterior prediction in RevBayes

    © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. Tests of absolute model fit are crucial in model-based inference because poorly structured models can lead to biased parameter estimates. In Bayesian inference, posterior predictive simulations can be used to test absolute model fit. However, such tests have not been commonly practiced in phylogenetic inference due to a lack of convenient and flexible software. Here, we describe our newly implemented tests of model fit using posterior predictive testing, based on both data- and inference-based test statistics, in the phylogenetics software RevBayes. This new implementation makes a large spectrum of models available for use through a user-friendly and flexible interface.
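
The general logic of a posterior predictive test with a data-based test statistic can be sketched on a toy example. The code below is not the RevBayes implementation and does not use a phylogenetic model: the Poisson-Gamma model, the made-up counts, the choice of the sample variance as test statistic, and all function names are assumptions made only to show the steps (draw parameters from the posterior, simulate replicate data, compare the statistic with its observed value).

```python
import math
import random
import statistics

# Toy model (an assumption for illustration, not from the paper):
#   lam ~ Gamma(shape=1, rate=1)
#   y_i | lam ~ Poisson(lam)
# Conjugacy gives the posterior lam | y ~ Gamma(1 + sum(y), rate=1 + n).

def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def posterior_predictive_p_value(observed, n_sims=2000, seed=1):
    """Fraction of simulated datasets whose variance is at least the observed one."""
    rng = random.Random(seed)
    n = len(observed)
    shape = 1.0 + sum(observed)
    rate = 1.0 + n
    observed_stat = statistics.variance(observed)
    exceed = 0
    for _ in range(n_sims):
        # random.gammavariate takes a scale parameter, so pass 1 / rate.
        lam = rng.gammavariate(shape, 1.0 / rate)
        simulated = [sample_poisson(rng, lam) for _ in range(n)]
        if statistics.variance(simulated) >= observed_stat:
            exceed += 1
    return exceed / n_sims

# Made-up, strongly overdispersed counts that a Poisson model fits poorly.
observed_counts = [0, 0, 1, 0, 12, 0, 9, 0, 1, 0]
p_value = posterior_predictive_p_value(observed_counts)
print(f"posterior predictive p-value: {p_value:.3f}")
```

A p-value near 0 or 1 indicates that data simulated under the fitted model rarely resemble the observed data with respect to the chosen statistic, which is the sense in which such tests assess absolute model fit.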

    MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space

    Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.